Journal article
Compact inverted index storage using general-purpose compression libraries
M Petri, A Moffat
Software: Practice and Experience | WILEY | Published: 2018
DOI: 10.1002/spe.2556
Abstract
Efficient storage of large inverted indexes is one of the key technologies that support current web search services. Here we re-examine mechanisms for representing document-level inverted indexes and within-document term frequencies, including comparing specialized methods developed for this task against recent fast implementations of general-purpose adaptive compression techniques. Experiments with the Gov2-URL collection and a large collection of crawled news stories show that standard compression libraries can provide compression effectiveness as good as or better than previous methods, with decoding rates only moderately slower than reference implementations of those tailored approaches...
Grants
Awarded by Australian Research Council
Funding Acknowledgements
Australian Research Council Discovery Projects Scheme, Grant/Award Number: DP140103256